This chapter outlines the differences between application-based and database-based systems. It also covers the basics of database structure and design theory.
Application-Based Approach to Event Processing:
In the application-based approach, each application collects and manages its own data in dedicated files that are separate from those of every other application. The approach concentrates on the process being performed.
Advantages
- If data corruption occurs, the damage may be limited to one application's files rather than affecting everything
- Less costly to implement than the database approach
Disadvantages
- Redundant data is stored in multiple files
- Data integrity can be compromised when the same data is updated in one file but not in the others
- Inefficient use of storage space
- Queries are limited to the data held in a specific file; because the files are not integrated, information cannot be combined across them
- Data in files is not shareable across applications
Database Approach:
Information about events is stored in relational database tables instead of separate files.
Advantages
- Reduces or eliminates data redundancy
- Improves data integrity
- Facilitates integrated business information systems that hold all of a company's data in one collection of relational databases
- Through querying, users can create customized reports to suit their needs
Disadvantages
- If corruption occurs, all of the data may be affected, because it is held in one place
- Expensive to implement
- When more than one person attempts to use the same data at the same time, there can be contention for that data
- There can be disputes about who is responsible for maintaining the data, since all of it is shared
To control some of these issues, a database administrator role may have to be established.
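As a minimal sketch of the querying advantage, the following uses Python's built-in sqlite3 module with an in-memory database. The table names (customer, sale), their columns, and the rows are hypothetical, invented only for illustration. Because both tables live in one database, a single query can combine them into a customized report.

```python
import sqlite3

# Hypothetical schema and rows, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL
    );
    CREATE TABLE sale (
        sale_id     INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
        amount      REAL NOT NULL
    );
    INSERT INTO customer VALUES (1, 'Acme'), (2, 'Bolt');
    INSERT INTO sale VALUES (10, 1, 100.0), (11, 1, 50.0), (12, 2, 75.0);
""")


def sales_report(connection):
    """Return (customer name, total sales) rows, ordered by name."""
    query = """
        SELECT c.name, SUM(s.amount)
        FROM customer AS c
        JOIN sale AS s ON s.customer_id = c.customer_id
        GROUP BY c.customer_id, c.name
        ORDER BY c.name
    """
    return connection.execute(query).fetchall()


report = sales_report(conn)
for name, total in report:
    print(f"{name}: {total:.2f}")
```

In the application-based approach, the customer and sale data would typically sit in separate files owned by different programs, so a report like this would require custom code to match the records.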
Logical and Physical Database Models:
A logical database model shows how the data is viewed by the user. The user retrieves data through this view without needing to know where the data is physically located. The physical database model shows the actual structure in which the data is stored. The two models are unlikely to look alike: the logical view is used by people such as management, while the physical model supports the developers of the database software.
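One loose way to picture the distinction, offered as a sketch rather than a formal definition: in SQL, a view can stand in for the user's logical view, while indexes belong to the physical side that the DBMS manages. The example below uses Python's sqlite3 module; the table, view, and index names are hypothetical. It reads the view, changes the physical structure by adding an index, and reads the view again to show that the user's picture of the data is unchanged.

```python
import sqlite3

# Hypothetical schema and rows, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE invoice (
        invoice_id  INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customer(customer_id),
        amount      REAL
    );
    INSERT INTO customer VALUES (1, 'Acme'), (2, 'Bolt');
    INSERT INTO invoice VALUES (100, 1, 40.0), (101, 1, 60.0), (102, 2, 25.0);

    -- Logical side: what the user sees and queries.
    CREATE VIEW customer_balance AS
        SELECT c.name AS name, SUM(i.amount) AS balance
        FROM customer AS c
        JOIN invoice AS i ON i.customer_id = c.customer_id
        GROUP BY c.customer_id, c.name;
""")


def read_view(connection):
    """Query the logical view; no knowledge of storage is needed."""
    return connection.execute(
        "SELECT name, balance FROM customer_balance ORDER BY name"
    ).fetchall()


before = read_view(conn)

# Physical side: adding an index changes how data is stored and accessed,
# not what the user sees through the view.
conn.execute("CREATE INDEX idx_invoice_customer ON invoice(customer_id)")

after = read_view(conn)
print(before == after)
```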
Data Redundancy:
Data redundancy is a disadvantage of the application-based approach. Problems that can arise because of redundancy:
- Increased storage space, because the system must store and maintain multiple versions of the same data in multiple files
- Data integrity can be compromised if redundant data stored in multiple files is updated in some files and not in others
- People using the data for decision making can receive very different results depending on which copy they use. Because data stored in different places is rarely updated simultaneously, both clerical and timing differences can arise, and the resulting integrity problems can lead to erroneous decisions
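The integrity problem described above can be sketched in a few lines of Python. The two "files" are modelled here as dictionaries, and the customer record, field names, and addresses are hypothetical. One application records a change of address, the other does not, and the two copies then disagree.

```python
# Hypothetical data: two applications each keep their own copy of a customer.
billing_file = {"C001": {"name": "Acme", "address": "1 Main St"}}
shipping_file = {"C001": {"name": "Acme", "address": "1 Main St"}}


def find_mismatches(file_a, file_b):
    """Return (record key, field) pairs where the two copies disagree."""
    mismatches = []
    for key in sorted(file_a.keys() & file_b.keys()):
        shared_fields = file_a[key].keys() & file_b[key].keys()
        for field in sorted(shared_fields):
            if file_a[key][field] != file_b[key][field]:
                mismatches.append((key, field))
    return mismatches


before_update = find_mismatches(billing_file, shipping_file)

# The customer moves, but only the billing application records the change.
billing_file["C001"]["address"] = "9 Oak Ave"

after_update = find_mismatches(billing_file, shipping_file)
print(before_update)
print(after_update)
```

A decision maker reading the shipping file after the update would still see the old address, which is the kind of timing difference the list describes.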
Data Normalization:
Normalization is the process of organizing data in a database. This includes creating tables and establishing relationships between those tables according to rules designed both to protect the data and to make the database more flexible by eliminating redundancy and inconsistent dependency.
1st Normal Form (1NF) - Eliminate Repeating Groups: Make a separate table for each set of related attributes, and give each table a primary key. A table is in first normal form if it does not contain repeating groups.
- Eliminate repeating groups in individual tables.
- Create a separate table for each set of related data.
- Identify each set of related data with a primary key.
- A table in 1NF can still contain partial dependencies that cause update anomalies.
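A minimal sketch of removing a repeating group, using plain Python structures in place of tables. The order data and the field names (order_id, customer, items, line_no) are hypothetical. The list of items inside each order is the repeating group; it is moved to its own table, keyed by order_id together with line_no.

```python
# Hypothetical unnormalized records: "items" is a repeating group.
unnormalized_orders = [
    {"order_id": 1, "customer": "Acme", "items": ["bolt", "nut"]},
    {"order_id": 2, "customer": "Bolt", "items": ["washer"]},
]


def to_first_normal_form(orders):
    """Split orders into an order table and an order-item table."""
    order_table = []       # primary key: order_id
    order_item_table = []  # primary key: (order_id, line_no)
    for order in orders:
        order_table.append(
            {"order_id": order["order_id"], "customer": order["customer"]}
        )
        for line_no, item in enumerate(order["items"], start=1):
            order_item_table.append(
                {"order_id": order["order_id"], "line_no": line_no, "item": item}
            )
    return order_table, order_item_table


order_table, order_item_table = to_first_normal_form(unnormalized_orders)
print(order_table)
print(order_item_table)
```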
2nd Normal Form (2NF) - Eliminate Redundant Data: If an attribute depends on only part of a composite (multi-attribute) key, remove it to a separate table. A table is in second normal form if it is in first normal form and has no partial dependencies; that is, no non-key attribute is dependent on only a portion of the primary key.
- Create separate tables for sets of values that apply to multiple records.
- Relate these tables with a foreign key.
- Non-key attribute: an attribute that is not part of the primary key.
- A table in 2NF can still contain transitive dependencies that cause update anomalies.
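A sketch of removing a partial dependency, again with plain Python structures and hypothetical data. The table is keyed by (order_id, product_id), but product_name depends on product_id alone, so it is moved to a separate product table and the order lines keep product_id as a foreign key.

```python
# Hypothetical 1NF table keyed by (order_id, product_id).
# product_name depends only on product_id: a partial dependency.
order_lines = [
    {"order_id": 1, "product_id": "P1", "product_name": "Bolt", "quantity": 5},
    {"order_id": 1, "product_id": "P2", "product_name": "Nut", "quantity": 8},
    {"order_id": 2, "product_id": "P1", "product_name": "Bolt", "quantity": 3},
]


def to_second_normal_form(rows):
    """Move product_name out to a product table keyed by product_id."""
    product_table = {}  # product_id -> product_name
    line_table = []     # primary key: (order_id, product_id)
    for row in rows:
        product_table[row["product_id"]] = row["product_name"]
        line_table.append(
            {key: row[key] for key in ("order_id", "product_id", "quantity")}
        )
    return product_table, line_table


product_table, line_table = to_second_normal_form(order_lines)
print(product_table)
print(line_table)
```

Each product name is now stored once, however many order lines refer to it.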
3rd Normal Form (3NF) - Eliminate Columns Not Dependent on the Key: If attributes do not contribute to a description of the key, remove them to a separate table. A table is in third normal form if it is in second normal form and has no transitive dependencies.
- Eliminate fields that do not depend on the key.
- The goal of normalization is to produce a database model whose relations are in 3NF.
- Normal forms are cumulative: a table in 3NF is also in 2NF and in 1NF.
- Transitive dependency: exists in a table when a non-key attribute is functionally dependent on another non-key attribute.
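A sketch of removing a transitive dependency, with hypothetical employee data. Here dept_name depends on dept_id, which is itself a non-key attribute of the employee table, so the department name is moved to its own table. After the split, renaming a department is a single update.

```python
# Hypothetical 2NF table keyed by employee_id.
# dept_name depends on dept_id, a non-key attribute: a transitive dependency.
employees = [
    {"employee_id": 1, "name": "Ann", "dept_id": "D1", "dept_name": "Sales"},
    {"employee_id": 2, "name": "Bob", "dept_id": "D1", "dept_name": "Sales"},
    {"employee_id": 3, "name": "Cy", "dept_id": "D2", "dept_name": "Audit"},
]


def to_third_normal_form(rows):
    """Move dept_name out to a department table keyed by dept_id."""
    department_table = {}  # dept_id -> dept_name
    employee_table = []    # primary key: employee_id
    for row in rows:
        department_table[row["dept_id"]] = row["dept_name"]
        employee_table.append(
            {key: row[key] for key in ("employee_id", "name", "dept_id")}
        )
    return employee_table, department_table


employee_table, department_table = to_third_normal_form(employees)

# Renaming a department now touches exactly one stored value.
department_table["D1"] = "Retail Sales"

dept_names = [department_table[e["dept_id"]] for e in employee_table]
print(dept_names)
```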
Boyce-Codd Normal Form (BCNF): If there are non-trivial dependencies between candidate key attributes, separate them out into distinct tables.
4th Normal Form (4NF) - Isolate Independent Multiple Relationships: No table may contain two or more 1:n or n:m relationships that are not directly related.
5th Normal Form (5NF) - Isolate Semantically Related Multiple Relationships: There may be practical constraints on information that justify separating logically related many-to-many relationships.
Why is data normalization such an important design characteristic of databases?
Because data that is not normalized can cause errors when data is added, changed, or deleted. Redundant data wastes disk space and creates maintenance problems. If data that exists in more than one place must be changed, it must be changed in exactly the same way in all locations.
Classifying and Coding Data:
There are five main ways to classify and code data.
Sequential coding: Assigns numbers to data objects in chronological sequence. An invoice number is a good example of this.
Block coding: Groups of numbers within a single ID number describe the objects being identified. A UPC is a good example of this, with one set of numbers describing the manufacturer and another set identifying the specific product.
Significant digit coding: Digit positions within the item's code can indicate which product group it belongs to, the warehouse or area where it is stored, and the item itself. Many types of inventory are coded in this manner.
Hierarchical coding: Again, the characters' positions in the code carry meaning about what the item is. In this type of coding, each succeeding number is a subset of the number before it. Postal codes use this type of coding.
Mnemonic coding: Uses letters in the code, usually to abbreviate descriptive words. This type of coding is used because it is easier for people to remember. Course numbers for college classes are a good example of this coding.
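Three of the coding schemes described above can be sketched in Python. Everything specific here is an assumption made for illustration: the starting invoice number, the 10-digit block code layout (which is not the actual UPC layout), and the course code "ACCT301".

```python
from itertools import count

# Sequential coding: numbers are handed out in order of creation.
# The starting value 1001 is a hypothetical choice.
_invoice_numbers = count(start=1001)


def next_invoice_number():
    """Return the next invoice number in sequence."""
    return next(_invoice_numbers)


def split_block_code(code):
    """Split a hypothetical 10-digit code into manufacturer and product blocks."""
    if len(code) != 10 or not code.isdigit():
        raise ValueError("expected exactly 10 digits")
    return {"manufacturer": code[:5], "product": code[5:]}


def split_mnemonic_code(code):
    """Split a course-style code into its letter prefix and its number."""
    letters = "".join(ch for ch in code if ch.isalpha())
    digits = "".join(ch for ch in code if ch.isdigit())
    return letters, digits


first_invoice = next_invoice_number()
second_invoice = next_invoice_number()
block = split_block_code("1234567890")
course = split_mnemonic_code("ACCT301")

print(first_invoice, second_invoice)
print(block)
print(course)
```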